# Visual-Text Conversion

UAE License Detection

Donut is an OCR-free document understanding Transformer model that combines a visual encoder and a text decoder to process document images.

Donut is an OCR-free document understanding Transformer model that combines a visual encoder and a text decoder for image-to-text conversion.

This is an image captioning model that generates plot descriptions from movie and TV show posters. Its plot summaries are decent, though far from perfect. We are continuously improving the model.

Transformers, English
